upup-ashton-wang-usc

Upup-ashton-wang's group workspace

Group: Resa - Main Models

Author

upup-ashton-wang

State

Finished

Start time

August 7th, 2025 7:19:15 AM

Runtime

23m

Run path

upup-ashton-wang-usc/Resa/fp7ukf74

OS

Linux-4.18.0-553.22.1.el8_10.x86_64-x86_64-with-glibc2.28

Python version

CPython 3.10.16

Command

/home1/shangsha/workspace/reasoning/reasoning-sae/./scripts/train/sae_tuning.py --config ./recipes/DeepSeek-R1-Distill-Qwen-1.5B/grpo/sae_tuning.yaml --base_model_name DeepSeek-R1-Distill-Qwen-1.5B --source_model_post_train_dataset_name still --source_model_post_train_type grpo --source_model_checkpoint checkpoint-0 --sae_name sae-DeepSeek-R1-Distill-Qwen-1.5B-65k --sae_hookpoint model.layers.12 --trigger_dataset_name deepscaler --sae_type trained_from_scratch --target_model_name DeepSeek-R1-Distill-Qwen-1.5B --elicitation_dataset_name deepscaler
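The flags in the command above can be read back into a structured object for inspection. The following is a minimal sketch, assuming every flag takes exactly one string value; `build_parser` is a hypothetical helper written for illustration and is not the argument handling used by `sae_tuning.py`.

```python
import argparse
import shlex

# Flag names copied from the logged command; the parser itself is hypothetical.
FLAGS = [
    "config", "base_model_name", "source_model_post_train_dataset_name",
    "source_model_post_train_type", "source_model_checkpoint", "sae_name",
    "sae_hookpoint", "trigger_dataset_name", "sae_type",
    "target_model_name", "elicitation_dataset_name",
]

def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts one string value per logged flag."""
    parser = argparse.ArgumentParser(description="SAE tuning (illustrative sketch)")
    for flag in FLAGS:
        parser.add_argument(f"--{flag}", type=str, required=True)
    return parser

# Arguments exactly as they appear in the logged command.
LOGGED_ARGS = (
    "--config ./recipes/DeepSeek-R1-Distill-Qwen-1.5B/grpo/sae_tuning.yaml "
    "--base_model_name DeepSeek-R1-Distill-Qwen-1.5B "
    "--source_model_post_train_dataset_name still "
    "--source_model_post_train_type grpo "
    "--source_model_checkpoint checkpoint-0 "
    "--sae_name sae-DeepSeek-R1-Distill-Qwen-1.5B-65k "
    "--sae_hookpoint model.layers.12 "
    "--trigger_dataset_name deepscaler "
    "--sae_type trained_from_scratch "
    "--target_model_name DeepSeek-R1-Distill-Qwen-1.5B "
    "--elicitation_dataset_name deepscaler"
)

args = build_parser().parse_args(shlex.split(LOGGED_ARGS))
print(args.sae_hookpoint)
print(args.source_model_checkpoint)
```

Parsing the logged string this way makes it easy to compare the command-line values against the config parameters recorded for the run.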

System Hardware

CPU count	64
Logical CPU count	64
GPU count	2
GPU type	NVIDIA L40S

W&B CLI Version

0.19.9

Config parameters (20 keys):
- base_model_name:
  "DeepSeek-R1-Distill-Qwen-1.5B"
- batch_size:
  1
- elicitation_dataset_name:
  "deepscaler"
- learning_rate:
  0.000001
- logging_steps:
  1
- lora_alpha:
  128
- lora_dropout:
  0.05
- lora_r:
  32
- lora_target_modules:
  (7 items, not expanded)
- num_epochs:
  2
- sae_hookpoint:
  "model.layers.12"
- sae_name:
  "sae-DeepSeek-R1-Distill-Qwen-1.5B-65k"
- sae_type:
  "trained_from_scratch"
- save_steps:
  500
- seed:
  42
- source_model_checkpoint:
  "checkpoint-0"
- source_model_post_train_dataset_name:
  "still"
- source_model_post_train_type:
  "grpo"
- target_model_name:
  "DeepSeek-R1-Distill-Qwen-1.5B"
- trigger_dataset_name:
  "deepscaler"
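To make the LoRA values above concrete, here is a minimal sketch of what `lora_r` and `lora_alpha` imply under the standard LoRA formulation, where the low-rank update is scaled by `alpha / r`. The `LoraSettings` class and the 1536 x 1536 projection size are assumptions introduced for illustration; neither comes from the run itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the LoRA values listed in the run config."""
    r: int
    alpha: int
    dropout: float

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / r.
        return self.alpha / self.r

    def added_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)

# Values taken from the config parameters: lora_r=32, lora_alpha=128, lora_dropout=0.05.
settings = LoraSettings(r=32, alpha=128, dropout=0.05)
print(settings.scaling)                   # 4.0
# 1536 x 1536 is an assumed example projection size, not read from the run.
print(settings.added_params(1536, 1536))  # 98304
```

With these settings each adapted weight receives an update scaled by 4, and the adapter adds far fewer trainable parameters than the full projection it modifies.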

Summary metrics (5 keys):
- epoch:
  1
- epoch_kl_loss:
  200.14176346356916
- learning_rate:
  0.000001
- sae_reconstruction_loss:
  4.3125
- step_kl_loss:
  138
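The summary reports both a per-step and a per-epoch KL loss, and the two values differ. One plausible reading, stated here as an assumption and not confirmed by the run page, is that `step_kl_loss` is the value at the last logged step while `epoch_kl_loss` is an average over all steps in the epoch. The sketch below illustrates that relationship with made-up numbers; `epoch_mean` is a hypothetical helper.

```python
from statistics import fmean

def epoch_mean(step_losses: list[float]) -> float:
    """Average the per-step losses of one epoch (assumed aggregation rule)."""
    return fmean(step_losses)

# Made-up step values for illustration only; they are not from the run history.
step_losses = [260.0, 210.0, 138.0]

last_step = step_losses[-1]           # what a per-step summary metric would show
epoch_value = epoch_mean(step_losses)  # what a per-epoch summary metric would show

print(last_step)
print(epoch_value)
```

Under this reading, a last-step value below the epoch average simply indicates that the loss was higher earlier in the epoch. The actual aggregation can be checked against the run history artifact.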

This run produced these artifacts as outputs. Total: 1.

wandb-history

run-fp7ukf74-history:v0